Longest Common Subsequence from Fragments via Sparse Dynamic

Author

  • Brenda S. Baker
Abstract

Sparse Dynamic Programming has emerged as an essential tool for the design of efficient algorithms for optimization problems coming from such diverse areas as Computer Science, Computational Biology and Speech Recognition [7, 11, 15]. We provide a new Sparse Dynamic Programming technique that extends the Hunt-Szymanski [2, 9, 8] paradigm for the computation of the Longest Common Subsequence (LCS) and apply it to solve the LCS from Fragments problem: given a pair of strings X and Y (of length n and m, resp.) and a set M of matching substrings of X and Y, find the longest common subsequence based only on the symbol correspondences induced by the substrings. This problem arises in an application to analysis of software systems. Our algorithm solves the problem in O(|M| log |M|) time using balanced trees, or O(|M| log log min(|M|, nm/|M|)) time using Johnson's version of Flat Trees [10]. These bounds apply for two cost measures. The algorithm can also be adapted to finding the usual LCS in O((m + n) log |Σ| + |M| log |M|) time using balanced trees or O((m + n) log |Σ| + |M| log log min(|M|, nm/|M|)) time using Johnson's Flat Trees, where M is the set of maximal matches between substrings of X and Y and Σ is the alphabet.
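To make the problem statement concrete, the following is a minimal Python sketch of a naive baseline, not the algorithm of the paper. It assumes fragments are given as hypothetical triples (i, j, length), meaning X[i:i+length] equals Y[j:j+length], and it assumes the cost measure is the number of matched symbols. It expands every fragment into its individual symbol correspondences and then finds the longest chain of correspondences that increases strictly in both strings, in the spirit of the Hunt-Szymanski paradigm. Because of the expansion, its running time depends on the total length of the fragments rather than on |M|, which is precisely the cost the paper's technique avoids. All function names are illustrative.

```python
from bisect import bisect_left


def chain_length(points):
    """Length of the longest chain of (i, j) points strictly increasing in both coordinates."""
    # Sort by i ascending and, within one i, by j descending, so that two
    # points sharing the same i can never both end up in one chain.
    ordered = sorted(set(points), key=lambda p: (p[0], -p[1]))
    # tails[k] is the smallest j that ends some chain of length k + 1 so far.
    tails = []
    for _, j in ordered:
        pos = bisect_left(tails, j)  # bisect_left enforces strict increase in j
        if pos == len(tails):
            tails.append(j)
        else:
            tails[pos] = j
    return len(tails)


def fragment_points(fragments):
    """Expand (i, j, length) fragments into the symbol correspondences they induce."""
    return [(i + d, j + d) for i, j, length in fragments for d in range(length)]


def lcs_from_fragments(fragments):
    """Naive baseline: LCS length using only correspondences induced by the fragments."""
    return chain_length(fragment_points(fragments))


def lcs_length(x, y):
    """Usual LCS length, using every single-symbol match as a point."""
    positions = {}
    for j, ch in enumerate(y):
        positions.setdefault(ch, []).append(j)
    points = [(i, j) for i, ch in enumerate(x) for j in positions.get(ch, [])]
    return chain_length(points)
```

For example, with X = "abcde" and Y = "xabcyde", the fragments (0, 1, 3) for "abc" and (3, 5, 2) for "de" can be chained, so `lcs_from_fragments([(0, 1, 3), (3, 5, 2)])` uses all five matched symbols. Two crossing fragments such as (0, 3, 2) and (2, 0, 2) cannot be combined, so only one of them contributes.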


Similar resources

Efficient algorithms for the longest common subsequence in $k$-length substrings

Finding the longest common subsequence in k-length substrings (LCSk) is a recently proposed problem motivated by computational biology. This is a generalization of the well-known LCS problem in which matching symbols from two sequences A and B are replaced with matching non-overlapping substrings of length k from A and B. We propose several algorithms for LCSk, being non-trivial incarnations of...


Sparse Dynamic Programming for Longest Common Subsequence from Fragments

Sparse Dynamic Programming has emerged as an essential tool for the design of efficient algorithms for optimization problems coming from such diverse areas as computer science, computational biology, and speech recognition. We provide a new sparse dynamic programming technique that extends the Hunt–Szymanski paradigm for the computation of the longest common subsequence (LCS) and apply it to so...


New Tabulation and Sparse Dynamic Programming Based Techniques for Sequence Similarity Problems

Calculating the length of a longest common subsequence (LCS) of two strings A and B of length n and m is a classic research topic, with many worst-case oriented results known. We present two algorithms for LCS length calculation with respectively O(mn log log n / log n) and O(mn / log n + r) time complexity, the latter working for r = o(mn / (log n log log n)), where r is the number of matches in the ...


A simple algorithm for the constrained sequence problems

In this paper we address the constrained longest common subsequence problem. Given two sequences X, Y and a constrained sequence P, a sequence Z is a constrained longest common subsequence for X and Y with respect to P if Z is the longest subsequence of X and Y such that P is a subsequence of Z. Recently, Tsai [7] proposed an O(n^2 · m^2 · r) time algorithm to solve this problem using dynamic prog...


The constrained longest common subsequence problem

This paper considers a constrained version of the longest common subsequence problem for two strings. Given strings S1, S2 and P, the constrained longest common subsequence problem for S1 and S2 with respect to P is to find a longest common subsequence lcs of S1 and S2 such that P is a subsequence of this lcs. An O(r n^2 m^2) time algorithm based upon the dynamic programming technique is proposed for...




Publication date: 2007